@comment{ @Part<Appendices, root "NAIVE.MSS[rdg,dbl]"> }
@Appendix(Why Use an Analogy?)
@Label(WhyAnalogy)

Chapter @Ref(Examples) spent ten pages listing a vast array
of different types of analogies.
This indicates that this cognitive ability is amazingly prominent,
but gives no clue why.
Why would anyone want to use an analogy?
First, why take the trouble of generating an analogy --
that is, what does one gain by asserting that A is similar to B for some reason?

One advantage is space efficiency.
The statement
@BEGIN(Example)
A is like B for reason @G(a)
@END(Example)
can encode a great many facts about A 
(and perhaps, about B as well).
This can be achieved in relatively few symbols
by exploiting the convenient fact
that the world is full of similarities,
@i{i.e.}, there will often be some B which is close to A.
This laconic encoding does have its price --
the hearer H will have to spend time decoding
the relevant facts about A, having to reason "through" B.
This also forces an increased processing complexity onto H's shoulders,
which would not be needed if he had only to retrieve a directly stored
fact.

Why should people be able to understand a given analogy?
One obvious (if circular) reason follows from the fact mentioned above:
people (either yourself or someone else)
will expect you to understand the analogies they are throwing at you.
This activity would be wasteful for both parties
if you, the recipient, could not decipher the intended message --
that is, unless you could then infer some new property of the topic
based on some known fact about the vehicle.

It is, of course, rather circular to assert that:
@BEGIN(DISPLAY)
People generate analogies to communicate with people who are able to understand them,
@w<     >@i{and}
People understand analogies only because other people will be generating them.
@END(DISPLAY)

There is, however, another compelling reason for using an analogy:
as mentioned above, the world itself is full of similarities.
Almost any experience/object/phenomenon is, in fact,
similar to some earlier experience/object/phenomenon -- 
with but slight modifications.
(@Cite(Quine)'s claim of the ubiquity of ostension supports this view.)

Given this structure of nature,
it is a good bet to conjecture that there is some real reason why 
two superficially similar objects are, in fact, related.
Hence, on observing some initial similarity joining a pair of objects,
it is reasonable to infer other connections as well.
This is precisely the type of inferencing
used in the analogical reasoning tasks mentioned above.
Here it is standing alone, independent of any (artificial human) generator.

This analogizing ability may have developed to exploit this
nice "continuity" present in the world.
As before, one advantage is storage space --
one need only store one "copy" of facts about some class of objects,
and notate each instance as a small modification of this.
This is the basic motivation for frame structures. (See @Cite<FRAME>.)
Another advantage is learning:
the ability to readily see a new object in terms of
(that is, as similar to)
some already known thing is the essence of learning and assimilation.
It is this ability which allows us to use our past experiences to
(begin to)
understand some new phenomenon.
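
Returning to the storage point for a moment,
here is a small Python sketch
(ours, not drawn from @Cite<FRAME>; the Elephant/Clyde frames are invented illustrations)
of the frame idea mentioned above:
one shared copy of the class facts,
with each instance noted only as a small modification of it.
@BEGIN(Example)
# A minimal sketch of the frame idea: one shared "copy" of the class facts,
# with each instance stored only as a small modification of it.

class Frame:
    def __init__(self, name, parent=None, **local_facts):
        self.name = name
        self.parent = parent            # the frame this one is "like"
        self.local = dict(local_facts)  # only the differences are stored here

    def get(self, slot):
        if slot in self.local:
            return self.local[slot]
        if self.parent is not None:
            return self.parent.get(slot)   # fall back on the shared copy
        raise KeyError(slot)

elephant = Frame("Elephant", color="grey", legs=4, habitat="savanna")
clyde    = Frame("Clyde", parent=elephant, color="white")  # one exception noted

print(clyde.get("color"))   # "white"  -- the local modification
print(clyde.get("legs"))    # 4        -- inherited from the class frame
@END(Example)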

Reiterating this point,
analogical inferences are justified in the communication case
because H, the hearer, knew that S, the sender,
had designed these messages to be decodable --
that S knew there were significant correlations between the two analogues which
H should be able to deduce,
and then use to deduce new facts about B based on known attributes of A.
The justification is quite different (and more metaphysical) in this latter case.
Here the connections are reasonable because the world itself
has nice continuities --
when A and B both share some feature P(x),
then it is often the case that A satisfies Q(x) whenever B satisfies it,
at least when the predicates P and Q are (somehow) similar.

In summary, we might propose there are three possible sources of an
analogy-puzzle, which we must understand (or decode):
@BEGIN(ITEM1)
From ourselves earlier - this is used for efficient storage,

From someone else - this is the communication (or metaphoric) sense of analogy,

From "nature" - this is called analogical reasoning or analogical exploration,
(depending on how many hints nature provides us...)
@END(ITEM1)

One might conjecture it was the other non-linguistic use of an analogy
which initially led to our ability to understand and use analogies,
and its tremendous usefulness for communication which prompted its
ubiquity in our day-to-day discourse.

@Appendix(Multiplicity of Analogies)
@Label(Multiplicity)

It is very tempting to claim that
there is a single best possible analogy joining any two objects.
This appendix attempts to justify the other view:
that there can be many reasonable analogies linking a pair of analogues.
Which one of these should be used should depend
not only on the analogues,
but also on the overall situation
(@i{e.g.},
@Comment{ the goal of this quest,}
who is asking the question and why).
In each of the following cases
several possible analogies can be drawn between two objects;
each of these is reasonable for certain applications,
and clearly inappropriate for others.

One important purpose of this appendix is to counter the 
(distressingly prevalent)
position that the phrase
"the best possible analogy" is well defined.@Foot{
Consider the
many current AI systems which use a single fixed set of ordered features
to define (the goodness of) any analogy.
This means that such programs can find 
at most
one analogy linking two analogues.
The designers are apparently claiming that that one result is sufficient.}
These examples also motivate the claim
that no single formulation (of the analogues) can be sufficient
to find any possible analogy.
This leads naturally to
an approach to solving this problem called reformulation,
which is discussed in the next appendix.

@AppendixSec(An Object Can Have Many Abstractions)

There are many ways of looking at an object.
In each perspective,
certain features are emphasized, and the rest are downgraded or ignored.
Consider @i{Washington}.
Historically, he is known as the first president of the country.
It is also true that his likeness is portrayed on all paper one dollar bills.
Knowing these two facts,
what is the "correct" answer to the simple proportional analogy
@BEGIN(QUOTATION)
Washington : 1 :: Lincoln : @i{?}
@END(QUOTATION)
As both Washington and Lincoln were U.S. presidents,
perhaps the first perspective should be considered.
Here the "1" obviously refers to the ordinality of Washington's presidency;
hence the answer must be @i{? = 16},
indicating that Lincoln was the sixteenth president.
Of course our more monetarily oriented friends will have a different opinion.
Clearly @i{?} should be @i{5},
as Lincoln is portrayed on the 5 dollar bill in the same manner
that Washington's likeness is on the 1 dollar bill.
(Or perhaps @i{?} should be @i{9}, @i{i}'s position in the alphabet,
based on the fact that the second letter of "Washington", "a", is the @i{1}st
letter, and the second letter of "Lincoln" is this "i".  Or perhaps ...)

Which is correct?  
They both are -- 
each analogy is apt, in its own way.
The criteria for deciding what makes a good analogy are quite 
personal and subjective,
based on the particular biases and orientations of the reasoner,
at this time.@Foot{
To simplify the situation,
this appendix will focus on the analogy-generation process,
and not deal with the use of an analogy.
Of course understanding an analogy is also quite subjective.
There the orientations of two individuals must be considered:
the understander, in the job of "decoding" the message,
must take into account the generator (human or nature)
who initially suggested this analogy.
Hence the biases of both parties are relevant, and must be noted.
@Comment(See Note @Ref<OnlyGen> in SubAppendix @Ref<ReformOther>.)}
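
To see how mechanically the answer tracks the chosen abstraction,
consider the following small Python sketch
(ours; the two tables below are the only "knowledge" it has):
@BEGIN(Example)
# Each abstraction maps a term onto a single feature; the answer to
# "Washington : 1 :: Lincoln : ?" depends entirely on which one is used.

presidential_order = {"Washington": 1, "Lincoln": 16}   # ordinality of presidency
bill_denomination  = {"Washington": 1, "Lincoln": 5}    # portrait on U.S. bill

def solve_proportional(abstraction, known, value, query):
    """Answer known : value :: query : ?, if this abstraction explains the pair."""
    if abstraction.get(known) == value:
        return abstraction.get(query)
    return None      # this abstraction does not even account for the given pair

for name, view in [("as president", presidential_order),
                   ("as face on a bill", bill_denomination)]:
    print(name, "->", solve_proportional(view, "Washington", 1, "Lincoln"))
# as president      -> 16
# as face on a bill -> 5
@END(Example)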

As noted above, each of these views 
(Washington as first president and as face on one dollar bills)
maps the object (represented by the term) Washington
onto some particular subset of relevant features.
In what follows we will call each such object
(or model) to theory mapping an @i{abstraction}.
(This term will be further defined in footnote @Ref(DefAbst),
and its use will be illustrated in SubAppendix @Ref(ReformOther).)
What makes analogizing difficult is that any given object can have
many different abstractions,
and any of these abstractions can be used in generating an analogy.

The general analogizing routine, therefore, seems to involve first
finding the appropriate abstraction for the analogues,
and then matching across corresponding features.
This "appropriate abstraction" will depend on a variety of other factors,
in addition to the analogues themselves.
(See both Properties @Ref(Subjective) and @Ref(ContextDependent)
in Appendix @Ref<Properties>, and the vocabulary mentioned in 
Appendix @Ref<Analogy-Vocab>.)
The following Appendix @Ref<Reform>
makes the claim that such abstractions may have to be generated "on the fly" 
-- contrary to the assumption that one need only employ some pre-established
list of possible abstractions.
(This requires reformulation -- which basically maps from one abstraction onto
another.)

@Comment{ Another useful case - merge in eventually: Neurological systems like Blood
Blood is probably regarded as a fluid.
Hence finding some connection between blood and another fluid, like Lymph,
is not surprising.  They clearly share a variety of properties, just by
virtue of their watery nature.
But what about neurological systems?  Such systems are not liquid, but they
still have circuits -- here with electronic rather than material flow. <other props 
in common?>
}
@AppendixSec(Examples of Domains with Many Axes)

This subappendix repeats the theme that many different analogies
can be proposed for the same pair of analogues.@Foot{
If you already believe this claim,
feel free to skip the remainder of this appendix.}
Here we show several cases where a given object
can be classified in many different ways,
and demonstrate how each of these categories can lead
to a different analogical connection.
This means that different analogies can be proposed between the same
pair of analogues,
depending on which axes were used for describing the objects.
(These axes, in turn, depend on the ultimate purpose for this analogy).
These objects are from the three domains:
Shakespearean works,
Computer programs
and Musical pieces.

@SubHeading(Shakespearean Works)
Consider first the plays which were written by Shakespeare.
Ask, in particular: what are the axes along which to compare them?
Clearly features such as deaths, betrayals, loves and the like can be used.
(The program described in @Cite(Winston) defined its similarity metric on
just these relatively low-level relations.)
Shakespeare himself appeared to use another, higher-level feature space to pair his works 
-- for example, he often matched a tragedy with a comedy 
(consider "Love's Labor's Lost" and "King Lear").
He also employed parallel, analogous sub-plots within the same play
-- compare Gloucester and his sons with the king and his daughters in "King Lear".
At a yet larger level,
we also expect that different cultures will employ different sets
of salient features when comparing plays.
(The monograph @Cite(Shake-Bush) elaborates on this point,
discussing how different cultures interpreted Hamlet,
emphasizing, in particular,
Hamlet's interaction with his father's ghost.)

The diversity of feature spaces can also be seen in some analyses of 
Shakespeare's sonnets.
While most reviews consider themes or overall semantic structure,
@Cite(HalletSmith)'s analysis is down at the level of the lexicon.
He observed that a particular group of words was used both in some sonnet
and in a particular passage of the Bible,
and inferred that Shakespeare was (either consciously or unconsciously)
connecting this work with the biblical passage.
(There are many other levels
-- some scholars critically examine 
rhythmic contours, or morphological sound patterns.)

@SubHeading(Computer Programs)
One obvious way of comparing computer programs is in terms of their behaviour.
Using this criterion, any pair of sorting routines would be considered analogous,
independent of asymptotic run time or data structures used.

But this is only one dimension.
Clearly Mycin (@Cite<Mycin>) and SACON (@Cite<SACON>) are similar,
even though these two programs do totally different things,
on totally different sets of arguments.
A further examination reveals the underlying similarity:
both are based on the EMYCIN inference engine, and employ data structures
appropriate to such processing 
(@i{e.g.}, production rules, context trees, CF's, ...).

Given such examples,
any "program abstracting index" should include not only I/O behavior,
but the underlying model as well.
Then the learning systems 
MetaDendral (@Cite<MetaDendral>) and Lex (@Cite<LEX>) will be found analogous,
as both employ the same Version Space method.
But what about MetaDendral and Winston's Arch-finding programs (@Cite<Arch>)?
How can we express the fact that both use some type of learning scheme?
This is clearly a new and different dimension 
-- @i{viz.}, type of application.

Tarjan discovered a (now familiar) similarity between
the existing "Fischer/Galler efficient garbage collection algorithm"
and the "Kruskel Minimal Spanning Tree algorithm".
This commonality was not at the level of I/O,
as these programs deal with different inputs
nor a single "generating procedure",
nor identical data-structures. 
Both, however, use the same clever path-compression trick,
(applicable to the general case of Union/Find operations,)
which causes the overall algorithm to speed up during subsequent iterations.
So now we include a category of "cute tricks employed" as well.
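
For the curious reader, the trick itself is easy to state in code.
The following is the standard textbook form of Union/Find with path compression,
given here as a Python sketch
(not, of course, the particular programs Tarjan compared):
@BEGIN(Example)
# Union/Find with path compression: after each Find, every node on the
# searched path is re-pointed directly at the root, so subsequent
# operations over the same elements become faster and faster.

parent = {}

def find(x):
    root = x
    while parent.setdefault(root, root) != root:
        root = parent[root]
    while parent[x] != root:               # second pass: compress the path
        parent[x], x = root, parent[x]
    return root

def union(x, y):
    parent[find(x)] = find(y)

union("a", "b"); union("b", "c")
print(find("a") == find("c"))    # True; and "a" now points straight at the root
@END(Example)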

After adding this, someone comments that RLL (@Cite<RLL>) and Eurisko 
(@Cite<HEURISTICS>) are similar.
They have different I/O behaviour, are for different types of applications,
and employ (almost) none of the same tricks.
However, (owing to a common ancestry,)
they do have some of the same variable names,
and are both written in a CLISP-less version of InterLisp (@Cite(InterLisp)).
Ah, yet another new dimension.  

And so on, and so on...  Other dimensions include
@BEGIN(ITEM1)

author of programs

school (where it was designed)

programming style

data (or control) flow analysis graph of the program

@END(ITEM1)

The moral is, no matter how many ways you describe a program,
there can always be other descriptions.
Furthermore, other people will find these new dimensions
so obvious they will effortlessly base analogies on them, which they
will expect you to follow.
As such, to understand analogies in general,
the hearer must be able to readily adapt to someone else's decomposition.

@SubHeading(Musical Works)
The final class of examples centers around the issue of similarity in music.
What does it mean for someone to tell you that
he finds two particular pieces to be similar?
It could be because they
@BEGIN(ITEM1)
share the same sequence of notes

share the same melodic contour

share the same rhythmic pattern

share the same frequency range

were written by the same composer
@Comment{ for similar composer:
recall the fury surrounding Mehta's performances of any Wagner work in Israel.
Imagine another composer had similar facist views...}

were written during the same period

are from the same (geographic) region

are in the same key

reflect (or express) the same feelings (@i{e.g.} sadness, elation, @i{etc}.)

are performed on the same set of instruments

employ the same bowing/tonguing/... technique (by the performer).
@END(ITEM1)
Of course "same" can be replaced with "similar" throughout the list above.
Notice further that this list includes only reasons which
an arbitrary hearer might be able to "spontaneously" understand,
unaided by extended description.
This list could otherwise be extended to include
more specific and detailed connections,
which the speaker would probably have to explain
(for example, performed on the same night,
or contributed to the score of the same musical, @i<etc>.).

In general, any item on the above list might be his intent.
One naive strategy for deciding what he meant would involve
scanning this list (or some similar one),
searching for the abstraction which both these pieces share.
The next appendix first elaborates this analogy recognizing/using scheme,
and then proposes a different system which does not have these dependencies.

These comparisons occur quite frequently -- both explicitly and implicitly.
For example,
consider how someone might approximate some piece,
@i{e.g.}, to hum it to someone else.
A rhythm-conscious person might
"abstract out" the pitches and tones from the piece,
leaving only the rhythm which he could percussively tap out.
Another person might more or less ignore the rhythm,
and hum the melodic contour of the work.
(Certainly anyone with perfect pitch would do this.)
A twelve-tone serialist (@i{a la} Schoenberg) would "understand" this piece
in a yet different manner 
-- noting the mathematical relations among the notes themselves,
disregarding "higher" connotations like "feelings", @i{etc}.
A violinist might perform a piece holding true to the bowing motion,
even if this meant altering both nuances of the rhythm and the notes themselves.
Which of these renditions is most analogous to the original work?
This clearly depends on the person with whom you are speaking,
and on your mutual goal.
(Who but another violinist would understand the derivation of the last example?)

Continuing on the theme of what sounds most like some piece, consider
how different instruments would perform the same work.
A violist might play the same piece quite differently from a violinist,
as each would use those techniques
which best emphasized their instrument's "personality".
A lutenist would probably perform it in a renaissance style,
while a flutist would probably perform it in a more contemporary style.
Which is most apt?

Music is full of analogies.
Throughout a work the composer will constantly return to a theme,
only it will be slightly different each time.
Which repetition is most like the original theme? 
Or consider the case where a musician is improvising (or jamming) on some theme.
Compare 
Purcell's "Abdelazer" with Britten's "The Young Person's Guide to the Orchestra".
Both are based on the same theme, but the "ornamentations" are quite different.
(Realize again how meaningless it is to ask whether 
"Abdelazer" is more like some other work by Purcell,
or more like "The Young Person's Guide to the Orchestra".)

Consider now performing a work --
even given a score,
there are still many ways of performing any piece.
Here the issues of styling arise --
we might then consider two performances similar if they were both played
in a Yehudi Menuhin-ish manner, or if Otto Klemperer conducted both.
(See Example @Ref(Style).)
Is the styling used in performing a piece more important than its score itself?
Again, this depends on what the overall basis for the comparison is.

@SubHeading(Conclusion)
These examples have (laboriously) demonstrated the fact that,
to understand an analogy,
one must know why it was generated,
in addition to the analogues.
Understanding the analogy may still be problematic, even with that knowledge --
the speaker may couch the analogues in terms unfamiliar to the hearer.
To appreciate the connection, the hearer may be forced to re-examine
the analogues, searching for an alternate representation of the objects,
which is resonant with the speaker's.
This re-examination and re-representation is the basis for reformulation,
the topic discussed in the next appendix.

@Appendix(The Nature of Reformulation)
@Label(Reform)

We have bandied the term "reformulation" about
all through this paper;
this appendix will actually specify what this term means,
and tell why we consider it important.
The first part overviews what reformulation is, and indicates how
it could fit into the analogy understanding process.
The second subappendix 
enumerates several subcases of reformulation 
to provide a more precise definition of this process.
The next subappendix presents some arguments illustrating
the necessity of reformulation for such tasks.
The concluding part lists some additional thoughts about reformulation,
emphasizing its relation to analogy.

@AppendixSec(Analogy = Reformulation + Match)

As we claimed above, 
(in Example @Ref(ProblemRestate) and in Section @Ref(AppElab),)
the trick to understanding a given analogy, 
(by computer at least)
is reformulation.
In a nutshell, this means that the hearer may first have to translate
each of the analogues from his initial representation
into another, equivalent representation.
If this is successful -- that is, if the new representations are appropriate --
the analogy will "fall out" as a simple parameter adjustment.

The prior appendix demonstrated that
a single pair of analogues can have many different analogies,
each based on some abstraction@Foot{
@TAG(DefAbst)
Here we take @i<abstraction> to mean a partial theory of the analogue.
For our applications,
it is synonymous with representation.
This usage is consistent:
Any representation can present only a 
finite(ly generated) set of features of the model,
and hide the rest.}
of the analogues.
For example, Washington is similar to Lincoln in that both were presidents,
or because both have portraits on some U.S. bill.

People usually (subconsciously) consider one abstraction of an object
more intuitive than the rest,
and therefore attempt to use only this representation when comparing this object
to others.
(@i{E.g.}, some will consider historical facts whenever possible;
in particular, preferring such information over facts about currency.)
This approach motivates a naive analogy-understanding algorithm, 
which uses only these representations when matching.
As we found above,
there is no guarantee that
this initial, @i{a priori} best representation will be appropriate,
that is, that it will lead to the desired analogy.
In these cases the hearer will have to drop that first abstraction
and re-examine the analogues themselves,
searching for a decomposition which is more appropriate to this particular problem.

One refinement of this approach is based on the realization
that any given object can have many possible abstractions,
each apt for certain cases.
Understanding the analogy, here,
involves first producing the range of abstractions for each analogue.
The abstraction most apt for this situation is then selected;
and this pair of abstractions (one from each analogue) is then compared.

People seem to use a slight modification of this process,
where much of the pruner's smarts is in the producer:
Here the possible abstractions of the objects are generated 
(and compared) one at a time.
This iteration stops when a pair of representations match.
(Hence we might first consider Washington as president.
If that fails, consider him as the figure on a dollar bill.
If neither of these works, one might go on to consider things like
Washington as the nation's capital, or as a state of the union,
or as a pun for "wash a ton", ...
While this is going on, the other analogue is also being described in various
ways, ...)

The order in which these abstractions are generated would be
guided by some heuristics which estimate "likelihood of match".
These rules would consider things like the other analogue,
(or rather, its current best representation
-- introducing a seemingly unavoidable circularity,)
the goal of this analogy, facts about the speaker, @i{etc}.@Foot{
@Cite<Carbonella>'s "invariance hierarchy" scheme has this overall flavor.
This system is based on a (purportedly) universal "abstraction ordering scheme",
which is a list of possible roles 
(here considered as possible invariants) of each analogue.
A quick description of the implied algorithm,
together with a list of its positive features and its limitations,
appears as Item @Ref<InvHier> below, in Appendix @Ref<Misc>.}
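
The loop described in the last two paragraphs can be sketched quite directly.
The following Python fragment is only a caricature of our own
(in particular, the "match" test is a crude stand-in
for whatever comparison the analogizer actually performs):
@BEGIN(Example)
# Abstractions of each analogue are proposed in heuristically ranked order,
# and the iteration stops as soon as a pair of them matches.

def understand_analogy(analogue_a, analogue_b, abstractors, rank):
    # abstractors: functions mapping an analogue onto one of its abstractions
    # rank:        heuristic "likelihood of match", which may consult the goal,
    #              the speaker, the other analogue, etc.
    ordered = sorted(abstractors, key=rank, reverse=True)
    for abstract_a in ordered:               # e.g. Washington as president ...
        for abstract_b in ordered:           # ... Lincoln as face on a bill ...
            view_a = abstract_a(analogue_a)
            view_b = abstract_b(analogue_b)
            if view_a.keys() == view_b.keys():   # crude stand-in for "match"
                return view_a, view_b
    return None    # no pre-stored abstraction worked; reformulation is needed
@END(Example)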

These approaches assume that it is possible to derive 
new and different abstractions from the model as needed;
@i{i.e.}, that the model itself is available.
Unfortunately,
this is impossible for any (real world) phenomenon --
all we can possibly have are the approximations produced by our limited sensors.
(One of the issues discussed in the SubAppendix @Ref(ReformOther) argues
that we are always dealing with abstractions rather than models.)

This means the best any analogizer 
(whether machine or human,)
can do is re-examine the current abstraction, attempting to find some new
representation which includes the relevant characteristics.
This re-representation is reformulation.

@AppendixSec(Just What is Reformulation?)
@Label(ReformCases)

Reiterating, reformulation changes the @i{description} of a problem.
In analogy cases,
this requires re-representing the analogues.
This subappendix lists several ways of redescribing the analogues --
each demonstrating a subcase of reformulation.
(In the style of the rest of this paper,
this list will be a fuzzy, naive first pass.)

@BEGIN(ITEM1)

@BEGIN(Multiple)
The current abstraction is sufficient, but awkward.@*
That is, everything needed is given in the current representation,
but the problem is still difficult to solve.
Below we list several different types of reasons for this difficulty.

@BEGIN(ITEM1)
The current abstraction includes too much.@*
Here the relevant features are buried among other facts.
Pruning away these unnecessary facts leaves a simple problem,
whose solution is straightforward.
The best (only?) example of this is @Cite[Amarel]'s Missionaries and Cannibals.
(Note that the form of the M&C problem was altered
as the "content" of the problem was decreased.)

@BEGIN(Multiple)
The relevant features must be derived/explicated.@*
Here we need to first derive other features for one or both of the analogues.
If done correctly, the new representations of the two analogues will be almost
identical, differing only in the value of that derived feature.
(In RLL (@Cite[RLL]) vocabulary, this means adding on a few new computable slots.)
Recall the "biological trees are like corporate hierarchies" analogy,
mentioned in @Cite[M&M].
This analogy is hard to see (and harder to justify)
if the tree is represented using only the BranchesFrom relation,
and if only the BossOf relation was used for the corporations.
However, we could reformulate the problem, by adding in
the transitive closure of both these relations.
We would then quickly notice both were acyclic,
and each determined a single maximal element.
At this level, it is easy to see that this BranchesFrom* relation
should map onto the BossOf* relation.
This, in turn, leads directly to the desired analogy.

As another example,
one can imagine situations where the day of the week was very relevant
in understanding an analogy,
but all we knew about each analogue was its date.
Of course there is already enough information to deduce the day,
and once this new property is computed, 
the relevant part of the analogy would fall out.
(One could similarly imagine situations where the date on the lunar calendar
was more important than the solar date, @i{etc}.)
@END(Multiple)

New objects must be constructed.@*
This is very similar to the case above 
-- here, we strive to find new entities 
(rather than new features of the old entities),
with which the solution falls out.
Consider the addition of new lines or points to a geometric proof.
(Many times these will reduce a proof to a triviality.)

The problem itself must first be transformed.@*
Here, we must transform the given problem into a similar one,
whose solution leads to the solution to the given problem.
(See Example @Ref(ProblemXForm).)
@END(ITEM1)
@END(Multiple)

@BEGIN(Multiple)
The current abstraction does not include the relevant facts, in any form.@*
Here one must re-examine the analogues themselves to find the analogy.
For example, there is no way to deduce the author of a program
from the code of a program alone,
or the rhythm of a musical piece from its melodic contour.
(As mentioned above, this is technically only possible for artificial objects
such as mathematical theories @i{qua} theories,
about which everything can be known.
In real world cases, 
we can derive a new abstraction of the object,
(say a given tree,)
by re-examining it,
looking now for certain different features.
Of course we may still fail, due to limitations of our sensory apparatus.)

If the analogue itself is unavailable 
(@i{e.g.}, forgotten as soon as this abstraction was derived, )
the best the reasoner can do is
use various heuristic methods to (try to) derive
these other abstractions from the current abstraction.@Foot{
The other case, when the abstraction was sufficient,
also relied on a fair amount of guesswork.
There, however, the guesses would tell which transformation to apply
-- @i<i.e.>, which new slots to generate, or which other problem to consider.}
@END(Multiple)
@END(ITEM1)
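
As promised in the list above, here is a small Python sketch of the
transitive-closure reformulation
(the tree and corporation data are, of course, invented for the illustration):
@BEGIN(Example)
# Adding the derived slots BranchesFrom* and BossOf*: each is simply the
# transitive closure of the relation the analogue was originally given in.

def transitive_closure(edges):
    # edges: dict mapping x onto the set of y with R(x, y); returns R*
    closure = {x: set(ys) for x, ys in edges.items()}
    changed = True
    while changed:
        changed = False
        for x, ys in closure.items():
            new = set().union(*(closure.get(y, set()) for y in ys)) - ys
            if new:
                ys |= new
                changed = True
    return closure

branches_from = {"twig1": {"limb"}, "twig2": {"limb"}, "limb": {"trunk"}}
boss_of = {"clerk1": {"manager"}, "clerk2": {"manager"}, "manager": {"president"}}

# In the reformulated representation both relations are acyclic and lead up to
# a single maximal element ("trunk", "president"), so BranchesFrom* maps
# directly onto BossOf*.
print(transitive_closure(branches_from))
print(transitive_closure(boss_of))
@END(Example)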

@AppendixSec{Why Use Reformulation?}
@LABEL<WhyReform>

Many people will argue that reformulation, while cute, is not needed.
Their claim is often based on the assumption that there is some universally
accepted feature space,
which all people (and hence any possible analogizing system)
automatically use.
Hence, they would argue, there is never the need to change the representation
of the problem.

This subappendix presents several counter-arguments.
The most obvious is based on the complexity of the world --
there are many orthogonal ways of categorizing any object/event/episode/...;
and each can lead to a different set of analogies.
For example,
Appendix @Ref<Subjective> above listed the vast number of
ways one can describe a musical piece.
(Again, each of these descriptions included
certain relevant features of the music, 
and ignored the rest.)
Any of these descriptions could be used to define an analogy --
that is, two musical pieces could be similar in any of these ways.
Realize there is no @i{a priori} reason to prefer one of these over the others
 -- each is the most applicable description, for some task.

One might still claim there are a large, but fixed,
number of decompositions of any object.
Analogizing would then involve selecting one of the object's descriptions,
(which could all have been generated during a "pre-processing" step,)
and matching along these dimensions.

We will counter this argument with a stronger claim:
No single, static collection of descriptions 
can be sufficient to find any arbitrary analogy.
Worded another way,
we feel that any competent analogizer must be able to dynamically
reformulate an input problem.
There are several reasons.
First, @Cite[Darden?] cites several cases
where more than one analogy was required to solve some problem.
Each of these contributes but one part to the overall solution;
no one of the provided representations of the problem, alone,
was sufficient.
(Here each transformation was required to map one part into a different form.)
Even providing all possible combinations of all possible partial descriptions
is not sufficient:
these new representations would have to be added to the pot of
representations, and thence further adapted.
(Perhaps this process would eventually converge 
-- but this would require a gigantic collection of representations.)
A smaller, conceptually easier solution is to provide some core nucleus
of representations, and some mechanisms for combining them.

Introspecting while problem solving will reveal another powerful argument
for reformulation:
people often notice a new (@i{i.e.}, previously unrealized) commonality
while solving the problem. 
(See Dimension @Ref(GenVsFind).)
It seems unlikely that such an abstraction could have been stored @i{a priori},
as the generator himself was previously unaware of this relation.

@AppendixSec<Other Comments on Reformulation>
@Label<ReformOther>

This subappendix lists some additional comments on the nature of reformulation:

@BEGIN[ITEM1, SPREAD=1]
@B(Only for Understanding)@*
In general, the speaker can employ the "obvious" representations for the
analogues when generating the analogy.
The only reason he
might reformulate the analogues is out of respect for the eventual hearer,
who will be attempting to understand the analogy,
and who may not be familiar with the initial particular "feature space"
(which the speaker considered self-evident).
Section @Ref[AppElab] expands this theme.

@B(Similarity @i{vs} Proportional Analogies)@*
The competent use of reformulation blurs the
distinction between similarity and proportional analogies.
One traditionally has an explicit handle
on the difference between the two analogues in
proportional analogies, and not in similarity analogies.
However, the purpose of reformulation is to produce just such a handle --
to explicate the difference between the analogues,
which we above called a changeable parameter.

@BEGIN<Multiple>@Tag<Sensor>
@B(Myth of Explicit Model)@*
One never really has access to the full model (of an analogue),
at least not to any real world object.
First, our sensors (primarily eyes and ears) present only an approximation to
the object itself --
they note only certain types of characteristics of the object,
(@i{i.e.}, visual facts, and not, for example, the full ontogeny of the object,)
and even then, they respond only to certain limited signals
within this space of possible information
(@i{i.e.}, only to certain electromagnetic wavelengths).
The purity of this image is further compromised by our higher level processing,
which biases our perception in certain ways.
(Consider any of the optical illusions which @Cite(Gregory) discusses.)
A third difficulty arises from the imperfection of our memory --
even had we recorded every possible pertinent fact,
we still might not be able to recall the ones needed
to flesh out an analogue.

All of these limitations apply in spades to any computer program which
attempts to understand or derive an analogy.
Its "primary sensory perceptions" are even more problematic --
as all of its data must have already passed through a human filter.

The fact that these problems also apply to humans
helps to answer the claim that 
no such sensor-less individual could possibly generate a "new" analogy --
that is,
that @i{de novo} creation of a new analogy by a computer is impossible.
Surely any analogy that it generated must have been somehow encoded within
the program's code and input.
As the starting premise of this item indicates,
we humans have similar limitations --
the mere presence of sensors does not guarantee that pure and total descriptions
will be generated.
Hence, the argument that no machine could ever produce a new analogy
can be readily transformed to apply to humans...
@END(Multiple)

@BEGIN<Multiple>@B<Reformulation and Models>@*
One of the major limitations of many existing AI analogy programs 
is their commitment to a
single particular set of domain characteristics,
and a single representation scheme.
@Cite[ThesisProp] proposes some simple reformulations which
permit new features to be defined, (based on existing ones,)
which begins to remove this dependency on the user's initial selection.

This approach defines analogy in terms of abstractions:
two objects 
(read models)
are considered analogous if they both satisfy the same partial theory.
Note this differs from the standard approach,
in which two objects are deemed analogous
if their @i{REPRESENTATIONS} match syntactically.
The problem with that approach arises from the fact that
a given model (@i{e.g.} a particular problem) may have many representations;
and the particular representation selected @i{a priori} may not be well suited
to finding the appropriate analogy.  
(Recall, for example, the arguments used above for the multiplicity of analogies.)
This motivates the usefulness of reformulation:
Problem reformulation changes only the representation of the problem,
without modifying the problem itself.
This allows us to find a new representation in which the underlying connection
linking the analogues is obvious,
and know that we are still dealing with the same problem.

The common partial theory view of analogy readily permits reformulation.
With this definition of analogy,
a problem X and its reformulation will always be analogous.
Hence the particular representation of an object is irrelevant.
We can therefore reformulate the descriptions of two objects,
knowing this will not affect whether the objects themselves are analogous.
The reformulations may, however, explicate certain desired features;
thereby reducing part of
the analogizing process to simple syntactic matching.

Note that this approach allows us to 
concentrate on the problem itself,
ignoring many of the syntactic features
which are prominent when comparing the representations of the two analogues.
Included are things like names of variables,
or number of arguments for a given relation.
(That is, we can feel free to add additional arguments to a relation.)
@END(Multiple)

@BEGIN[MULTIPLE]@B<Reformulation is not Always Needed>@*
Of course there will be some cases where the formulation given is sufficient.
(Realize this is NOT a function of the problem, @i(per se), only of its encoding.)
Recall the "5p to 3p" analogy.
This problem was given using a descriptive language in which
each command is regarded as a string of sequential characters.
In this language this analogy is but a simple parameter value change 
-- that is, no (non-degenerate) reformulation is needed.
However, imagine now that these commands were each viewed as a bit vector,
or that the editor required Roman numerals.
If those cases still seem too straightforward,
we could make this silly editor use one command format to visit
the @i<p@+{th}> page when this @i<p> is prime,
and another when @i{p} is composite.

Here @u{2,3,3p} would move to the @i{18}@+{th} page,
while @u{19P} would move to the @i{19}@+{th}.
If we used the obvious representation for these commands
(@i{i.e.}, the string composed of the tokens 
@u{2}, @u{,}, @u{3}, @u{,}, @u{3} and @u{p} for the first command,
and of @u{19} and @u{P} for the second,)
then we would need a reformulation step to translate the first
(pre-@i{p}/@i{P}) parameters to the number represented,
and to disregard the "upper-casedness" of the following character 
(that is, whether it was @i{p} or @i{P}).
(A small sketch of this reformulation step appears just after this list.)
Of course this reformulation is unnecessary if we already stored
these commands in that type of format --
in essence, performing this transformation as the command is being described.@Foot{
The artifactual-ness of the domain of editors makes any problem
which requires a major reformulation seem silly.
Given that editors were designed for people to use,
their linguistic front-ends should conform to natural ways of viewing the problem.}
@END[MULTIPLE]

@END(ITEM1)
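The reformulation step mentioned in the last item of the list above can be
written down directly; here is a toy Python version
(the command syntax is, of course, our own invention,
following the silly editor described there):
@BEGIN(Example)
# Map each raw command onto the page number it denotes, translating the
# pre-p/P parameters into a single number and disregarding the case of the
# trailing p/P.  After this step, "2,3,3p" and "19P" differ in only one
# parameter value, so the analogy between them is immediate.

from math import prod

def page_number(command):
    digits, suffix = command[:-1], command[-1]
    assert suffix in ("p", "P")              # ignore "upper-casedness"
    factors = [int(tok) for tok in digits.split(",")]
    return prod(factors)                     # composite pages arrive factored

print(page_number("2,3,3p"), page_number("19P"))    # 18 19
@END(Example)
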
@Appendix<Analogy Vocabulary>
@Label<Analogy-Vocab>

Here we consider what constitutes a natural description for 
constraining how two things are analogous.
In many cases these would not need to be stated explicitly,
as this additional information can often be understood implicitly.
The descriptions below can also be used to (partially) define the analogy --
that is, they can indicate how the analogues map into one another.
(See note on page @PageRef<ConstraintReason>.)

Fred is [just] like Jill
@BEGIN(ITEM1)
@ux(in that both are) students.@*
Or: @ux(as both are), @ux(regarding both to be), ...@*
@i{I.e.}: both are in the same category or class; and this comparison should
be based on this perspective.

@ux(in terms of) profession.@*
Or: "<Adjective>ily, A is like B", as in "Cognitively, people are like computers."@*
@i{I.e.}: both are in the same category, based on the <Adjective> perspective.

@ux(except that) Jill is female.@*
Or: @ux(were it not that), @ux(disregarding the fact that), ... @*
@i{I.e.}: Some implicit, unstated slot [here gender] has a different value.

@ux(ignoring) gender.@*
Or: @ux(excluding), @ux(disregarding), ... @*
@i{I.e.}: The values of their respective X slots [here gender] differ.

@ux(considering only their) height. (or other property)@*
Or: @ux(based only on), @ux(considering just), 
@ux(where) X @ux(are the most salient attributes.)@*
@i{I.e.}: The values of their respective X slots [here height] are the same (or similar).

@ux(after substituting) CalTech @ux(for) Stanford.@*
Or: @ux(except in) X @ux(context rather than) Y, 
@ux(but deals with) X @ux(rather than) Y, ...@*
@i{I.e.}: There is a strong (@i{e.g.}, causal) reality to the 
proportional metaphor@*
@w(    )Fred:Jill :: CalTech : Stanford@*
Some relation joining Fred to CalTech also holds between Jill and Stanford.

@ux(as) Prince Charming @ux(is to) Cinderella.@*
Or: @ux(ala) Prince Charming @ux(with respect to) Cinderella, ...  @*
@i{I.e.}: (At least) one of the relations joining Prince Charming to Cinderella
links Fred to Jill. Perhaps "loves enough to search for"?
(This is much like the case above.)

@END(ITEM1)

It is pretty easy to see how this maps onto the similarity case.
The trick mentioned in footnote 
@Ref<Unify>
@Comment{ This does NOT work - not until footnotes are handled properly:
on page @PageRef<Unify>
<<Maybe this sould be @Value(Unify) ???>> }
discusses how this relates to the proportional case.
@Appendix<Miscellaneous Thoughts>
@Label<Misc>

@BEGENUM1<>
@B(Analogical Inferences)@*
Perhaps we should define a particular set of analogical inferencing steps.
For example, we could define a rule of inference, such as
@BEGIN(DISPLAY)
P(A)
R(A,B)
@g(p)(P,P')
@g(r)(R)
------------
P'(B)
@END(DISPLAY)
where @g(p) and @g(r) are appropriately defined second-order predicates.
We are instead using just standard predicate calculus,
with the usual inferential steps.  
(We do reserve the right to embellish this later.)

@BEGIN(Multiple)
@B(Precision of Analogy)@*
George Polya,
(whose books and teachings have done much to legitimize the use of
heuristic methods like analogy,)
made an interesting point about induction.
This term, he (sadly) observed,
has taken on a rather precise meaning in mathematics:
It originally referred to a large array of conjecture-generating operations,
all using some form of generalization.
Mathematicians then refined the term "induction" to its current state,
referring exclusively to a particular, provably valid inference step.
While he agreed this last sense was useful,
he also felt the other senses were still quite worthwhile 
(and indeed, essential for activities like discovery).

Analogy may be in a similar position.  
There is the fear that the formalizing steps presented in this paper
may serve to mechanize this conjecturing process to the same extreme.
We should therefore attempt to clarify our goal --
we are trying to demystify some aspects of this process,
not reduce it to a single algorithmic process.
It will still remain a heuristic method, applicable only when
more powerful and more nearly guaranteed methods are inapplicable,
or have failed.
@END(Multiple)

@BEGIN(Multiple)
@B(Lakoff's Position)@*
@TAG(Lakoffette)
@Cite(Lakoff) lists many cases where an entire collection of terms was
taken from one application and applied to another.
It is necessary for both the topic and vehicle
(using the nomenclature defined in @Cite(Paivio))
to satisfy a common theory (there called the "ground").

One might conjecture that we people have some
special-purpose internal hardware which has been "evolutionarily"
tuned to handle certain common situations (@i{e.g.}, Up versus Down),
which is here being applied to other cases which require the same type of reasoning
-- here, for any linear ordering, such as degree of happiness.
This process permits efficient reasoning, based on "hard-wired" routines.
The alternative approach requires reasoning from the underlying theory.
(Here, from facts about the less-than relation.)
Hence, efficient algorithms, developed for one model,
can be applied to a different model.
The only requirement is that the new model satisfies the same theory.
(This theme is further exploited in @Cite(ArchMRS).)
@END(Multiple)

@BEGIN(Multiple)
@B(Psychological Data)@*
The results reported in @Cite<Learn-HR>
are relevant to the idea mentioned in Section @Ref(AnalQuests).
This research demonstrated
that people tend to clump together certain sets of features,
rather than just deal with features on an individual basis.
This suggests that people would tend to map from one feature space to another,
rather than just from one individual property to another.
Hence it might make sense to search for the closure of
a set of preliminary constraints;
where this final cluster of constraints,
when "applied" to the analogues,
would construct the analogy for that situation.

<<Are there other useful tidbits?>>
@END(Multiple)

@BEGIN<Multiple>@Tag<DvsP>
I considered separating this "Deductive/Predictive" application into
two categories,
"Deductive" and "Predictive"
(depending, of course, on how confidently R can determine B's new property).
There were three reasons I rejected that further partitioning.
First, this seems more a continuum than a simple split 
-- ranging over the degree of certainty in R's mind.
Second, there are several other criteria which could be used for splintering
this reasoning case.  
There seemed no (non-arbitrary) way I could allow this separation,
but not those.
Finally, this could force a similar bifurcation in the "Linguistic" case,
separating the case when H could positively deduce the new property
about B, from when he could only guess.
In addition to the problems above,
this would also permit various pragmatics issues to creep in:
perhaps H could claim that he knew enough
about S that he knew what S was really thinking,
to a sufficient detail that this metaphor was transparent.
If possible, I would prefer to avoid opening this can of worms...
@END(Multiple)

@BEGIN<Multiple>@Tag<InvHier>
@B<Comments on Carbonell's "Invariance Hierarchy">@*
This scheme defines analogy as the first case
where both analogues share a common invariant.
(Some examples of these invariants are presented below.)
The implied analogy-understanding algorithm could generate these abstractions
sequentially,
testing them one by one, and returning the first such common role which passed.
(A small sketch of this sequential-test scheme appears just after this enumeration.)

Thus the algorithm would first see if both analogues shared the first invariant --
which is true if both (represented animate objects which) 
share a common "goal expectation setting".
If so, this common expectation would be returned as the most appropriate analogy.
Otherwise, the analogizer would investigate whether both analogues
involved the same type of planning strategies.
This would mean they share the second invariant,
which would be returned as the basis for the best analogy of these analogues.
And so on, through the eight other types of matches.

Each of these invariants is an abstraction of the analogue.
Thus the program is mapping down a certain set of abstractions,
stopping as soon as a match is found.

There are several complications and limitations with this scheme.
One confusion derives from the fact that there are
two different categories of property types used to define an invariant.
The first class are properties of the object itself,
such as the temporal ordering invariant.
The second are features of the representation,
(and not the object @i{per se},)
such as "descriptive properties", which refers to unary predicates.

A second issue is the claim that these ten types of invariants are sufficient.
It is never explained why this particular list is considered 
either appropriate, or exhaustive.
While any particular characteristic could somehow be tagged
(using, if nothing else, the "descriptive properties" catch-all invariant),
there is no reason 
(beyond some weak introspection and head-nodding)
to believe that these ten types of invariants
reflect the most relevant partitioning of the space.

Another complication with this scheme is the issue
of how to label a particular characteristic with the appropriate "invariance type".
This task seems at best quite difficult, and possibly too ill-defined
to be performed.

Modulo these problems,
many aspects of this approach do seem plausible and usable --
this break-down does reflect a nice set of heuristics,
which could be used to rank the aptness of an analogy.
Carbonell has, in fact, outlined some reasons 
(based on intuitions and introspections,)
explaining why this particular search strategy is effective,
and why people do seem to use it.
[Carbonell, @i<personal conversation>]
@END(Multiple)

@ENDENUM1<>
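The sequential-test scheme discussed in the last item above can be caricatured
in a few lines of Python
(the three invariant tests below are stand-ins of our own, not Carbonell's actual ten):
@BEGIN(Example)
# Walk the fixed "invariance hierarchy" in order and return the first
# invariant that both analogues share.

def shares(invariant, a, b):
    va, vb = invariant(a), invariant(b)
    return va is not None and va == vb

def first_common_invariant(a, b, hierarchy):
    for name, invariant in hierarchy:        # fixed, purportedly universal order
        if shares(invariant, a, b):
            return name                      # taken as the basis of the analogy
    return None

hierarchy = [
    ("goal/expectation setting", lambda x: x.get("goal")),
    ("planning strategy",        lambda x: x.get("plan")),
    ("descriptive properties",   lambda x: frozenset(x.get("props", []))),
]

a = {"plan": "means-ends", "props": ["red"]}
b = {"plan": "means-ends", "props": ["blue"]}
print(first_common_invariant(a, b, hierarchy))    # "planning strategy"
@END(Example)
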
@BEGIN(Comment)
@BEGIN(Multiple)
@B(Pragmatics)@*
@Tag(Pragmatics)
This item mentions some comments about the nature of pragmatics,
emphasizing how this relates to metaphor.
(In particular, to explain why I did not want to split the linguistic case...)
<<here>>
Note he need not actually communicate this connection to H.
If not then H has even more work...  See below.

This need not be the case:
Consider the "electricity resembles fluid flow" example.
Even if S did not realize the actual connection between these systems
he might still issue this statement,
confident that the physicist H will 
-- i.e. that H will still "understand" the intended meaning,
even though S's message only suggested it.
There are still other cases --
e.g. when S is technically wrong, but H can still get the intended message ...
@END(Multiple)

@BEGIN(Multiple)
@B(Analogy in Music, Revisited)@*
Consider what it means to say that two musical performances are of the SAME PIECE,
given that every performance is different?
Consider the similarities and differences
between two conductors' 
(instrumentalists')
renditions of the same score, or two playings of the same recording.
Clearly there is some type of analogy connecting these...

Music is now more "Masculine" (sharp, articulated) rather than
the earlier more "feminine" style - soft flowing 
(Note those terms are metaphoric!)}
@END(Multiple)

-- further example of use of correct formulation --@*
As an example,
consider now the water flow/electrical circuit analogy.
One might view each of the variables in the
various equations associated with the water case as a unit,
with a slot which pointed to the instantiation of this term --
as in junction, or reservoir, or pipe cross-section.
In this representation, changing from water to electricity requires nothing
beyond substituting wire contacts,
battery, and resistivity for the values of those "instantiation slots".
@END(Comment)